ABSTRACT


Machine learning is a branch of artificial intelligence that can forecast future values based on
previous experience. Methods have been proposed to construct forecasting models using machine
learning algorithms such as Neural Networks (NN), Support Vector Machines (SVM) and Deep
Learning. This paper presents a comparative performance study of machine learning algorithms
for cryptocurrency forecasting, concentrating on the forecasting of time series data. SVM has
several advantages over the other models in forecasting, and previous research has shown that
SVM produces results close to the actual values while also improving forecasting accuracy.
However, recent research has shown that, owing to small sample ranges and to data manipulation
resulting from inadequate evidence and professional analyzers, the overall status and accuracy
rate of forecasting still need to be improved in further studies. Thus, further research on the
accuracy rate of the forecasted price is needed.

1. INTRODUCTION

Forecasting the future values or prices of experimental time series plays a vital role in almost all fields of
study, including economics, science and engineering, finance, business, meteorology and
telecommunication [1]. Cryptocurrency is an alternative medium of exchange comprising over 1441 (as of
January 2018) decentralized crypto coin types. Applying machine learning algorithms to cryptocurrency is
considered a new field with limited research. In general, such a system can be applied to any supervised
machine learning problem and, in return, provides a description relevant to samples both inside and outside
the dataset.

There are numerous types of cryptocurrency, including Bitcoin, Litecoin, Ethereum, Nem, Ripple,
Iota, Stellar and others. The cryptographic foundation of each crypto coin is what makes it vital. Because the
exchange rates of cryptocurrencies are notoriously volatile, we attempt to model an algorithm that
can be used in trading numerous cryptocurrencies. To show the accuracy rate of the prices predicted
by the proposed methodology, historical data for six cryptocurrencies are used as explanatory examples:
Bitcoin, Ethereum, Litecoin, Nem, Ripple and Stellar. This paper uses the mean
absolute percentage error (MAPE) to evaluate the proposed models.
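As a point of reference, MAPE is conventionally computed as the average of |actual - forecast| / |actual| over the test period, expressed in percent. The sketch below is a minimal illustration of that conventional definition, not the authors' implementation; the function name `mape` and the sample prices are hypothetical.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent (conventional definition)."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if not actual:
        raise ValueError("series must not be empty")
    total = 0.0
    for a, f in zip(actual, forecast):
        if a == 0:
            # The percentage error is undefined for a zero actual value.
            raise ValueError("MAPE is undefined when an actual value is zero")
        total += abs((a - f) / a)
    return 100.0 * total / len(actual)


# Hypothetical closing prices and forecasts, for illustration only.
actual_prices = [100.0, 200.0, 400.0]
forecast_prices = [110.0, 190.0, 400.0]
example_mape = mape(actual_prices, forecast_prices)
```

A lower MAPE means the forecasts lie closer, in relative terms, to the actual prices, which is why it suits price series whose levels differ widely between cryptocurrencies.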

The outline of this paper is as follows. Section 1 introduces some basic notions of cryptocurrencies
and machine learning algorithms: Section 1.1 discusses the types of cryptocurrency, including two of the
largest alternative blockchain technologies, Litecoin (LTC) and Ethereum (ETH), and the purpose behind each
development, and Section 1.2 presents machine learning algorithms and three widely used algorithms,
Artificial Neural Networks (ANN), Support Vector Machines (SVM) and Deep Learning. Section 2 describes the
proposed methodology and the data, Section 3 explains the experiments and the results obtained using all
models, and Section 4 concludes.

Journal homepage: http://iaescore.com/journals/index.php/ijeecs
ISSN: 2502-4752


1.1. Cryptocurrency 

Litecoin (LTC) and Ethereum (ETH) are among the largest alternative blockchain technologies,
known as altcoins, and were invented after Bitcoin (BTC). Altcoins may have been developed for different
purposes, but they share a general methodology based on a decentralized P2P network, with the assumption of
no network failure and no Internet interruption [2-5]. Research on cryptocurrency is still limited and
mostly focuses on a single cryptocurrency rather than on broader areas such as
technological advancement, government participation in market regulation and market
development [6]. This section covers six types of cryptocurrency: Bitcoin, Ethereum,
Litecoin, Nem, Ripple and Stellar. The succeeding section reviews previous studies
on Machine Learning, Support Vector Machines (SVM), Artificial Neural Networks (ANNs) and Deep
Learning applied in forecasting.

Bitcoin is a peer-to-peer (P2P) electronic cash system: a non-regulated digital currency, introduced in 2008,
with no legal tender status. It is classed as a cryptocurrency because cryptographic functions secure
its creation and the transfer of money. In recent years, Bitcoin has become
the best-known currency in terms of trading volume, which makes it the financial medium with the most
potential for investors [7]. It secures each transaction, as the identities of the sender and receiver and
the volume of the transaction are all encrypted [6].

Ethereum (ETH) is a decentralized blockchain-based technology that provides a Turing-complete platform to
build and execute smart contracts or distributed systems [8-9]. Its coin is called ether. It was
introduced by Vitalik Buterin in 2013 and funded a year later with US$18 million worth of bitcoins
raised through an online public crowd sale [8]. Ether has no limit on its circulation and can be traded on
cryptocurrency exchanges; it is not meant to be a payment system, and its intention is merely to be used
within the Ethereum network [1, 9].

Litecoin (LTC) was released in October 2011 by Charles Lee using technology similar to Bitcoin's.
Its block generation time is four times shorter (2.5 minutes instead of 10 minutes per block), its
maximum limit of 84 million coins is four times higher than Bitcoin's, and it
adopts a different hashing algorithm [9-10]. Litecoin is considered the 'silver standard' of crypto coins
and has become the second most accepted coin among both miners and exchanges [9]. It uses the Scrypt
algorithm, in contrast to SHA-256; it was developed to beat the Bitcoin network's transaction confirmation
speed and to use an algorithm that is resilient to advances in mining hardware.

NEM is a blockchain notarization, also known as a peer-to-peer platform, that provides services such as
online payment and messaging. Its jointly owned notarization makes NEM
the first public/private blockchain combination [8].

Ripple is an open-source digital currency created by Jed McCaleb and his partner, Chris Larsen. It is a
distributed peer-to-peer payment network controlled and managed by a single organization, and it
offers an alternative security mechanism [6, 8]. The development of Ripple is based on the Byzantine
Consensus Protocol, and the maximum number of Ripple coins is 100 billion [8].

Stellar, like Ripple, offers an entirely alternative security mechanism and is implemented based on the
Byzantine Consensus Protocol. Stellar has introduced a new technology for processing financial
transactions that is open source, distributed and of unlimited ownership [6, 11].


1.2. Machine Learning 

To succeed in trading, mastering analysis is very important. Future value can be analyzed in two
different ways: technical analysis and fundamental analysis. Technical analysis uses trading information from
the market, such as price and trading volume, to forecast the future price, while fundamental analysis uses
information from outside the market, such as the economic situation, interest rates and geopolitical issues,
to forecast the future direction [11]. Many investors focus on technical analysis, some focus on fundamental
analysis, and others focus on the overlap between the two. This paper addresses
technical analysis by applying machine learning algorithms. Machine learning models have been established as
serious contenders to classical statistical models in the forecasting community for more than two decades
[1], [12]. The two most widely used algorithms for forecasting price movement are Artificial Neural Networks
(ANNs) and Support Vector Machines (SVM), and each has its own pattern of learning [11, 13]. ANNs have been
widely used for prediction in securities. A number of issues with ANNs have been discussed by researchers,
including the selection of parameters and of the training set [14].

According to [1], the embedding formulation implies that when a historical dataset S is available, one-step
forecasting can be treated as a supervised learning problem. Supervised learning is the task of deriving a
function from a set of training data made up of input variables and output variables, where the outputs are
considered dependent on the inputs. One-step forecasting can be applied when such a mapping model exists [1].
In one-step forecasting, the n previous values of the series are available, so forecasting can be performed
as a generic regression problem, as illustrated in Figure 1. The general approach to modelling an
input/output relationship relies on the availability of observed pairs, denoted the training set. The
training set is derived from the historical series S by creating the $[(N - n - 1) \times n]$ input data
matrix X.

In one-step forecasting, the approximator $\hat{f}$ returns the prediction of the value of the time series at
time $t + 1$ as a function of the $n$ previous values (the rectangular box containing $z^{-1}$ represents a
unit delay operator, i.e., $y_{t-1} = z^{-1} y_t$) [1].

The corresponding $[(N - n - 1) \times 1]$ output vector is

$$Y = \begin{bmatrix} y_N \\ y_{N-1} \\ \vdots \\ y_{n+1} \end{bmatrix} \qquad (1)$$

For the sake of simplicity, we assume a lag time of $d = 0$. Henceforth, in this paper we refer to
the ith row of X, which is essentially a temporal pattern of the series, as the (reconstructed) state of the
series at time $t - i + 1$.

Figure 1. Proposed Methodology
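The embedding described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the helper name `embed_series` and the sample values are hypothetical, the lag time is taken as d = 0, and each row of X pairs the n values that precede a target with that target, ordered from the latest target y_N back to y_{n+1}.

```python
def embed_series(series, n):
    """Build (X, Y) for one-step forecasting with lag time d = 0.

    Row i of X holds the n values preceding the target Y[i], most recent
    value first; rows run from the latest target back to the earliest one.
    """
    if n < 1 or n >= len(series):
        raise ValueError("n must satisfy 1 <= n < len(series)")
    X, Y = [], []
    # t is the 0-based index of the target value in the series.
    for t in range(len(series) - 1, n - 1, -1):
        X.append([series[t - k] for k in range(1, n + 1)])
        Y.append(series[t])
    return X, Y


# Hypothetical daily closing prices, for illustration only.
prices = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X, Y = embed_series(prices, n=3)
```

Any regression model can then be fitted on the pairs (X[i], Y[i]), which is what allows one-step forecasting to be handled as a generic supervised learning problem.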


1.2.1. Support Vector Machine (SVM) 

The Support Vector Machine (SVM) classifier was introduced as an induction principle that
avoids over-fitting the training dataset [15], and it is known as one of the most
flexible techniques for constructing explicit and accurate boundaries [16], [17]. SVM works very well in
various applications, trains quickly and is easy to use [18]. SVM has been applied to
problems ranging from pattern recognition to fault diagnosis [15, 19]. It gives nonlinear and robust
solutions by applying kernel functions to map the input space into a higher-dimensional feature space [20].
The benefits of SVM include strong generalization and good performance on small
datasets, and it has proved useful in many fields, including pattern classification [14]. In addition,
SVM produces a classification hyperplane that separates two classes of data with maximum margin; this is
the standard SVM model.

Another important point of discussion is the choice of SVM type. SVM offers linear and
nonlinear models. Linear SVMs outperform nonlinear ones in terms of speed and execution time, but
underperform on complex datasets that contain many training examples but few features. Nonlinear SVMs,
although they lose explanatory power, seem to perform steadily across various problems
and have become the preferred choice over linear SVMs [18].
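For reference, the maximum-margin model discussed in this section is commonly written as the soft-margin optimization problem below. This is the textbook formulation, given under the assumption that it matches what is meant by the standard SVM model; the symbols are introduced here for illustration only: $x_i$ are the m training inputs, $c_i \in \{-1, +1\}$ their class labels, $\phi$ the kernel-induced feature map, $\xi_i$ the slack variables and $C > 0$ the penalty parameter.

```latex
\min_{w,\, b,\, \xi} \quad \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{m} \xi_i
\quad \text{subject to} \quad
c_i \left( w^{\top} \phi(x_i) + b \right) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \qquad i = 1, \dots, m
```

With $\phi$ taken as the identity map the model is a linear SVM; a nonlinear kernel replaces the inner products $\phi(x_i)^{\top}\phi(x_j)$ in the dual problem, which is the linear versus nonlinear choice described above.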


1.2.2. Artificial Neural Networks (ANNs) 

A common neural network that performs deep learning at its hidden layers is called an artificial
neural network [21]. A standard ANN comprises an input layer, hidden layers and an output layer [22]. It is a
highly parallel system consisting of interrelated and interacting processing nodes, or neurons [23]; it works
like a human brain and processes information through the interaction of a number of simple processing
elements [23]. Input neurons are activated by sensors perceiving the environment, while other neurons are
activated through weighted connections from previously active neurons; some neurons can affect the
environment by triggering actions [24]. Depending on the problem and how the neurons are linked, such
behavior may require long chains of computational stages, where each stage transforms the aggregate
activation of the network.
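The layered structure described above can be illustrated with a single forward pass. The sketch below is a minimal example under stated assumptions, not the network used in the experiments: it assumes one hidden layer with tanh activations and a linear output, and the function name `forward` and all weights are hypothetical.

```python
import math


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass through a network with a single hidden layer."""
    hidden = []
    for weights, bias in zip(hidden_weights, hidden_biases):
        # Each hidden neuron sums its weighted inputs, then applies tanh.
        activation = sum(w * x for w, x in zip(weights, inputs)) + bias
        hidden.append(math.tanh(activation))
    # The output neuron is a weighted sum of the hidden activations.
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Hypothetical weights and inputs, for illustration only.
prediction = forward(
    inputs=[0.5, -0.5],
    hidden_weights=[[1.0, 1.0], [1.0, -1.0]],
    hidden_biases=[0.0, 0.0],
    output_weights=[2.0, 3.0],
    output_bias=1.0,
)
```

Each layer here is one computational stage: the hidden activations are computed from the inputs, and the output is computed from the hidden activations.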


1.2.3. Deep Learning (DL) 

Deep Learning covers a diverse set of methods in neural networks [25] and primarily aims to obtain the
most precise result across many computational stages, as shown in Table 1 [24]. DL is capable of producing
strong results based on feature extraction over multiple layers [25]. The models described in this section
apply a non-linear function to the hidden units, which enables a richer model that can learn more abstract
representations; a deep network is formed when such modules are stacked on top of each other [26]. The goal
of a deep network is to build features at the lower layers that separate the factors of variation in the
input data and then combine these representations at the higher layers. The drawback of training with
multiple hidden layers lies in how the error signal is backpropagated [26].
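One common reading of this drawback, assumed here rather than stated in the text, is that the backpropagated error signal shrinks as it passes through many layers. The sketch below illustrates the effect for sigmoid units, whose derivative never exceeds 0.25; the helper names and the unit weight are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_derivative(z):
    s = sigmoid(z)
    return s * (1.0 - s)


def backprop_scale(pre_activations, weight=1.0):
    """Product of the per-layer factors weight * sigmoid'(z) along one path."""
    scale = 1.0
    for z in pre_activations:
        scale *= weight * sigmoid_derivative(z)
    return scale


# At z = 0 the sigmoid derivative takes its maximum value, 0.25.
shallow = backprop_scale([0.0] * 2)   # two hidden layers
deep = backprop_scale([0.0] * 10)     # ten hidden layers
```

Even in this best case the factor for ten layers is several orders of magnitude smaller than for two, so the lower layers of a deep network receive a much weaker training signal.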


Table 1. Variable Description 


Variable      Description
Open Price    The first price of a given cryptocurrency in a daily trading
Close Price   The price of the last transaction for a given cryptocurrency at the end of a daily trading
High Price    The highest price that was paid for a cryptocurrency during a daily trading
Low Price     The lowest price of a cryptocurrency reached in a daily trading


2. PROPOSED METHODOLOGY 

In this paper, we consider time series data based on up to 5 years of daily history as inputs for all models;
the length varies with the availability of data from the source. The data consist of the daily open,
close, high and low prices for all six types of cryptocurrency, downloaded
from the market capitalization database and ranging from 2013 through 2018.


2.1. Data Description 

The main purpose of this paper is to obtain the most accurate forecast price based on the above-mentioned
methods. Bitcoin (BTC) is the first digital currency in the market capitalization list, with data from
March 2013 through January 2018. Training data for Bitcoin run from 28th March 2013 to 16th
January 2017, for Ethereum from 7th August 2015 to 16th January 2017, for Litecoin from 28th April
2013 to 16th January 2017, for Nem from 1st April 2015 to 16th January 2017, for Ripple from 4th August 2013
to 16th January 2017 and for Stellar from 5th August 2014 to 16th January 2017. Testing data for all
selected cryptocurrencies run from 17th January 2017 through 16th January 2018.

Table 2 shows the training and testing datasets in our time series data. The first segment is the training
set (number of values as per #Observations). Several classifiers are then used to
predict the test data in the second segment (the number of values in the testing set is 364).


Indonesian J Elec Eng & Comp Sci, Vol. 11, No. 3, September 2018 : 1121 — 1128 




Table 2. The training and testing dataset in our time series data 


Cryptocurrency             Training Data                          Test Data
                           From       To         #Observations    From       To         #Observations
Bitcoin, BTC               28-Mar-13  16-Jan-17  1388             17-Jan-17  16-Jan-18  364
Ether or "Ethereum", ETH   7-Aug-15   16-Jan-17  526              17-Jan-17  16-Jan-18  364
Litecoin, LTC              28-Apr-13  16-Jan-17  1358             17-Jan-17  16-Jan-18  364
Nem, XEM                   1-Apr-15   16-Jan-17  657              17-Jan-17  16-Jan-18  364
Ripple, XRP                4-Aug-13   16-Jan-17  1262             17-Jan-17  16-Jan-18  364
Stellar, XLM               5-Aug-14   16-Jan-17  896              17-Jan-17  16-Jan-18  364
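The chronological split in Table 2 can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `split_by_date` and the placeholder rows are hypothetical, and only the cut-off dates are taken from Table 2. The observation counts obtained in practice depend on which days the data source actually provides.

```python
from datetime import date, timedelta

# Cut-off dates taken from Table 2.
TRAIN_END = date(2017, 1, 16)
TEST_START = date(2017, 1, 17)
TEST_END = date(2018, 1, 16)


def split_by_date(rows, train_start):
    """Split (day, close_price) rows into training and testing lists."""
    train = [r for r in rows if train_start <= r[0] <= TRAIN_END]
    test = [r for r in rows if TEST_START <= r[0] <= TEST_END]
    return train, test


# Hypothetical gap-free daily rows with a placeholder price of 1.0.
start = date(2015, 8, 7)
last = date(2018, 1, 31)
rows = [(start + timedelta(days=i), 1.0) for i in range((last - start).days + 1)]
train, test = split_by_date(rows, train_start=start)
```

Splitting by date rather than at random keeps every test observation later than every training observation, which is what a forecasting experiment requires.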


3. RESULTS AND ANALYSIS 

The results section begins by showing performance measures for each cryptocurrency type
according to classifier. These serve as a control for the rest of the discussion. The analysis is separated
into two experiments: i) performance measures by various classifiers and ii) cryptocurrency values forecasted
by machine learning algorithms versus actual values. Table 3 shows the performance accuracy of
four classifiers on the cryptocurrency market capitalization data. The maximum value is 95.5%, which means
that any alphas over 95.5% have a p-value of 0.01 or less.


Table 3. Performance Measures by various classifiers 
Performance Accuracy (%)

Classifier   Bitcoin  Ethereum  Litecoin  Nem    Ripple  Stellar
SVM          78.90    95.50     82.40     47.70  70.00   58.70
ANNs         79.40    78.00     75.80     77.80  81.40   89.80
DL           61.90    69.40     62.80     57.20  60.90   70.70
BoostedNN    81.20    81.60     72.20     77.40  81.50   92.80


Several different classifiers were trained with the same set of features and
evaluated using classification accuracy. All classifiers, whichever method generated them, are compared on
the same dataset, so training and testing are fair for all of them.

The results for the classifiers with the best performance on the test set are reported in Table 3. SVM
works best for Ethereum, followed by Litecoin. ANNs achieve the highest accuracy for Nem, while BoostedNN
has the best performance accuracy for Bitcoin, Ripple and Stellar. Among all results, the SVM classifier
achieves the highest performance accuracy, 95.5%.
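The per-cryptocurrency comparison can be read directly off Table 3. As a minimal sketch, with the accuracy values copied from Table 3 and the names `ACCURACY`, `COINS` and `best_classifier` introduced here for illustration only, the best classifier for each cryptocurrency is the one with the highest accuracy in its column:

```python
# Performance accuracy (%) copied from Table 3.
COINS = ["Bitcoin", "Ethereum", "Litecoin", "Nem", "Ripple", "Stellar"]
ACCURACY = {
    "SVM": [78.90, 95.50, 82.40, 47.70, 70.00, 58.70],
    "ANNs": [79.40, 78.00, 75.80, 77.80, 81.40, 89.80],
    "DL": [61.90, 69.40, 62.80, 57.20, 60.90, 70.70],
    "BoostedNN": [81.20, 81.60, 72.20, 77.40, 81.50, 92.80],
}


def best_classifier(coin):
    """Return the classifier with the highest accuracy for one coin."""
    column = COINS.index(coin)
    return max(ACCURACY, key=lambda name: ACCURACY[name][column])


best = {coin: best_classifier(coin) for coin in COINS}
# Highest single accuracy in the whole table, with its classifier.
top_accuracy, top_name = max(
    (max(values), name) for name, values in ACCURACY.items()
)
```

The same lookup extends to any further classifier or cryptocurrency added to the table.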

For comparability, the same datasets and the same period of 364 days were chosen for all classifiers.
Performance can be seen in Figures 2-7. The SVM significantly outperformed the other classifiers. This result
is further explored using the mean absolute percentage error (MAPE). The MAPE of SVM is 0.31%, the lowest
among the classifiers. Thus, SVM is considered a reliable forecasting
model for these six selected cryptocurrencies.

Figure 2. SVM value is comparable to the actual Bitcoin value for the period from 17/1/2017 to 16/1/2018

Figure 3. SVM value is comparable to the actual Litecoin value for the period from 17/1/2017 to 16/1/2018

Figure 4. SVM value is comparable to the actual Ripple value for the period from 17/1/2017 to 16/1/2018

Figure 5. SVM value is comparable to the actual Ethereum value for the period from 17/1/2017 to 16/1/2018

Figure 6. SVM value is comparable to the actual Nem value for the period from 17/1/2017 to 16/1/2018

Figure 7. SVM value is comparable to the actual Stellar value for the period from 17/1/2017 to 16/1/2018


4. CONCLUSION 

This paper focuses on the comparative performance of machine learning algorithms on six
cryptocurrencies. To begin with, the review of cryptocurrency covered six major cryptocurrencies:
Bitcoin, Ethereum, Litecoin, Nem, Ripple and Stellar. Further, previous studies on forecasting with Machine
Learning, Support Vector Machines (SVM), Artificial Neural Networks (ANNs) and Deep Learning were
explored.

Firstly, performance measures were computed to obtain the accuracy of the classifiers on the selected
cryptocurrencies, with the results shown in Table 3. The results show that SVM outperformed the other
classifiers with an accuracy of 95.5%. It was found that the quality of the training data and the size of
the dataset play an important role in successful prediction.

Secondly, the cryptocurrency values forecasted by machine learning were compared with the actual
values. In this comparative analysis, SVM produced values comparable to the actual values
for all cryptocurrencies for the period from 17/1/2017 to 16/1/2018.

Moreover, the result was further explored using the mean absolute percentage error (MAPE).
The results show that SVM has the lowest MAPE. Thus, SVM is considered a reliable
forecasting model for the selected cryptocurrencies.

In future, the algorithm will be improved with respect to the accuracy rate of the forecasted price. In
addition, future work will further optimize the SVM to bring its results as close as possible to
the actual values of the cryptocurrencies.